THRESHOLD DEPENDENT ROBUST DISCRIMINATION
FOR CONVEX PROBABILITY UNCERTAINTY CLASSES


I. INTRODUCTION

Robust binary signal discrimination is concerned with finding detection structures whose performance
measures over an input class (or classes) are nontrivially lower and/or upper bounded. Normally the
underlying  probability  measures  for  the  binary  hypotheses  are  defined  in  terms  of  uncertainty  or 
neighborhood  classes.  The  detector  performance  measures  can  be  false  alarm  probability,  detection 
probability,  risk,  output  signal-to-noise  (S/N)  power  ratio,  or  deflection. 

It was proven by Strassen [1, 2] for finite spaces and then by Huber and Strassen [3] for Polish
spaces that the Neyman-Pearson lemma generalizes for uncertainty classes that can be characterized as
Choquet's 2-alternating capacities [4, 5]. Let $\Omega$ be a Polish space and let $\mathcal{A}$ stand for the $\sigma$-Borel field
on $\Omega$. By $\mathcal{M}$ we denote the set of all probability measures on $\mathcal{A}$. The 2-alternating capacity used by
Huber and Strassen [3] can be defined as a set function $\eta$ from $\mathcal{A}$ to [0, 1] which is the upper probability
of a weakly compact set of probability measures, and it satisfies the condition
$\eta(A \cup B) + \eta(A \cap B) \le \eta(A) + \eta(B)$ for all $A, B \in \mathcal{A}$. The set $\mathcal{P}$ of all probability measures
majorized by $\eta$, i.e. $\mathcal{P} = \{P \in \mathcal{M} : P(A) \le \eta(A) \text{ for all } A \in \mathcal{A}\}$, is said to be generated by $\eta$.
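The 2-alternating condition can be checked by brute force on a small finite space. The following is a minimal sketch, not part of the cited results: it assumes the uncertainty set is given as a finite list of probability mass functions (the upper probability of a finite family coincides with that of its convex hull), and the function names are illustrative.

```python
from itertools import chain, combinations

def upper_probability(measures, event):
    """Largest probability that any pmf in `measures` assigns to `event`."""
    return max(sum(p[i] for i in event) for p in measures)

def all_events(n):
    """Every subset of {0, ..., n-1}, as frozensets."""
    points = range(n)
    subsets = chain.from_iterable(combinations(points, r) for r in range(n + 1))
    return [frozenset(s) for s in subsets]

def is_two_alternating(measures, tol=1e-12):
    """Check eta(A u B) + eta(A n B) <= eta(A) + eta(B) for all events A, B."""
    events = all_events(len(measures[0]))
    for a in events:
        for b in events:
            lhs = upper_probability(measures, a | b) + upper_probability(measures, a & b)
            rhs = upper_probability(measures, a) + upper_probability(measures, b)
            if lhs > rhs + tol:
                return False
    return True

# A single measure is additive, so the condition holds with equality.
print(is_two_alternating([[0.2, 0.3, 0.5]]))
# Two measures with disjoint supports violate it (take A = {0, 2}, B = {0, 3}).
print(is_two_alternating([[0.5, 0.5, 0.0, 0.0], [0.0, 0.0, 0.5, 0.5]]))
```

The search is exponential in the size of the space, so it is only meant for toy examples.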

The  Huber-Strassen  results  were  further  extended  in  terms  of  special  capacities  by  Rieder  [6]  and 
Bednarski  [7],  and  general  capacities  by  Vastola  [8].  Results  for  specific  uncertainty  classes  are  given 
by  Huber  [9,  15],  Kassam  [10],  and  Vastola  and  Poor  [11].  All  of  the  above  results  pertain  to  the  signal 
discrimination  problem  whereby  each  hypothesis  is  characterized  by  non-overlapping  uncertainty  classes. 
Other  approaches  consider  the  generalized  signal-to-noise  ratio  (cf.  [12,  13])  as  a  performance  measure. 

For  all  of  the  results  previously  cited,  the  robust  test  between  the  two  composite  hypotheses  reduces 
to  a  test  between  two  simple  hypotheses  whereby  the  underlying  probability  measures  are  fixed 
representatives  of  the  specified  uncertainty  classes.  The  representative  measures  are  independent  of  the 
test’s  threshold.  In  many  cases  the  resultant  test  is  a  censored  version  of  a  nominal  likelihood  ratio. 
Hence,  arbitrarily  low  false  alarm  probabilities  cannot  be  attained  without  a  trivial  randomization  of  the 
decision  rule.  In  this  paper,  we  develop  a  new  class  of  robust  discriminators  whereby  the  solutions  are 
threshold  dependent  (or  for  short,  T-dependent).  Specifically  we  are  looking  for  decision  rules  such  that 
if  the  threshold  of  the  rule  is  specified,  then  the  Bayes  risk  of  the  detector  is  sharply  upper-bounded  over 
given  input  uncertainty  classes.  By  sharp,  we  mean  that  there  is  at  least  one  pair  in  the  hypotheses’ 
uncertainty  classes  for  which  the  upper-bound  is  attained.  It  is  in  this  sense  that  we  define  robustness. 
For  this  development,  the  support  of  the  random  variables  is  assumed  to  have  a  finite  number  of 
elements. 

The  basic  motivation  for  finding  the  robust  T-dependent  solutions  is  to  provide  a  mechanism  for 
generating  robust  solutions  for  uncertainty  classes  that  are  not  necessarily  2-alternating  capacitable.  If 
certain  conditions  are  satisfied,  we  will  find  that  the  uncertainty  classes  need  not  be  2-alternating 
capacitable  in  order  for  robust  solutions  to  exist.  It  will  be  shown  that  the  robust  discrimination  solution 
is  again  given  by  a  fixed  representative  pair  of  simple  hypotheses. 

This  paper  is  organized  as  follows.  In  Section  II,  we  formulate  the  discrimination  problem  and 
summarize  an  earlier  formulation  due  to  Huber  [8]  and  Poor  [12].  In  Section  III,  the  T-dependent  robust 
solutions  for  signal  discrimination  are  formulated  as  solutions  of  a  particular  minimization  problem  similar 
to  that  developed  by  Huber  and  Strassen  [3].  Formulations  and  conditions  for  solutions  for  specific 
uncertainty  classes  (the  divergence  class  and  its  generalization,  the  divergence/linear  class)  are  given  in 
Sections  IV  and  V. 
II. PRELIMINARIES


Let $(X, \mathcal{F})$ be a measurable space, and let $P_0$, $P_1$ be distinct probability measures on it. Assume that
$P_0$ and $P_1$ are members of two disjoint classes, $\mathcal{P}_0$ and $\mathcal{P}_1$, respectively, of possible distributions on
$(X, \mathcal{F})$, and that $\mathcal{P}_0$ and $\mathcal{P}_1$ are convex probability classes, i.e. if $P_i, P_i' \in \mathcal{P}_i$ then
$(1 - \nu)P_i + \nu P_i' \in \mathcal{P}_i$ for $0 \le \nu \le 1$, $i = 0, 1$. Let $P_i$ ($i = 0, 1$) have density $p_i$ with respect to
some measure $\mu$ and assume that $\mu \gg P_0, P_1$ and $P_0 \gg P_1$ for all $P_i \in \mathcal{P}_i$ ($i = 0, 1$). For this space, $X$ is
the set of possible observations and the support of $X$ has a finite number of elements and is denoted by
$\Omega_x$. $\mathcal{F}$ is the $\sigma$-algebra of possible observation events. In addition, let $x = \{x_n,\ n = 1, 2, \ldots, N\}$ be a
sequence of complex ($X \subset \mathbb{C}^N$) identically distributed (but not necessarily independent) random
variables (r.v.'s) defined on $(X, \mathcal{F})$.

On the basis of observing the vector $X = (X_1, X_2, \ldots, X_N)^T$, where $T$ denotes transpose and $X_n$ is the
realization of the r.v. $x_n$, we wish to decide between the following pair of hypotheses concerning $X$:

$$\begin{aligned} H_0 &: X \sim P_0 \in \mathcal{P}_0 \\ H_1 &: X \sim P_1 \in \mathcal{P}_1 \end{aligned} \tag{2.1}$$

where $X \sim P$ indicates that the observation vector $X$ is distributed according to the distribution $P$.

Let $\phi$ be any test between $\mathcal{P}_0$ and $\mathcal{P}_1$ accepting $\mathcal{P}_1$ with conditional probability $\phi(X)$ given that $X$ has
been observed. Assume that a cost $C_i$ is incurred only if $H_i$ is falsely rejected ($i = 0, 1$). The expected
costs, or risks, are given by

$$R(P_0, \phi) = C_0\, E\{\phi \mid H_0 \text{ true}\} = C_0 \operatorname{Prob}\{\phi \text{ accepts } H_1 \mid H_0 \text{ true}\}, \tag{2.2}$$

$$R(P_1, \phi) = C_1\, E\{1 - \phi \mid H_1 \text{ true}\} = C_1 \operatorname{Prob}\{\phi \text{ accepts } H_0 \mid H_1 \text{ true}\}, \tag{2.3}$$

where $E$ and Prob denote expectation and probability, respectively. Consider the following minimax
testing problems:


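$$\min_{\phi \in D}\ \max_{P_1 \in \mathcal{P}_1} R(P_1, \phi) \quad \text{subject to} \quad \max_{P_0 \in \mathcal{P}_0} R(P_0, \phi) \le \alpha \tag{2.4}$$

and

$$\min_{\phi \in D}\ \max_{(P_0, P_1) \in \mathcal{P}} \left[\pi_0 R(P_0, \phi) + \pi_1 R(P_1, \phi)\right] \tag{2.5}$$

where $\pi_i = \operatorname{Prob}\{H_i \text{ occurs}\}$, $\mathcal{P} = \mathcal{P}_0 \times \mathcal{P}_1$, and $D$ denotes the class of all randomized decision rules.

The problems described by (2.4) and (2.5) are the minimax Neyman-Pearson and Bayes hypothesis testing
criteria, respectively.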

When the measures $P_i$ and their respective densities, $p_i$, are known, then the optimal decision rule
for both of the above problems is given by the likelihood ratio test [14]:

$$\phi(X) = \begin{cases} 1, & p_1(X)/p_0(X) > T \\ \gamma, & p_1(X)/p_0(X) = T \\ 0, & p_1(X)/p_0(X) < T \end{cases} \tag{2.6}$$

where the randomization parameter $\gamma$ and threshold $T$ are chosen to achieve the desired risk performance.
For the Bayes criterion, $\gamma = 0$.
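For a finite support, the randomized likelihood ratio test (2.6) and its size and power reduce to a few sums. The sketch below is illustrative only: the function names are hypothetical, densities are passed as lists over the support, and $p_0 > 0$ is assumed everywhere so that the ratio is defined.

```python
def likelihood_ratio_test(p0, p1, threshold, gamma=0.0):
    """Return phi(x) for every support point, following (2.6)."""
    phi = []
    for q0, q1 in zip(p0, p1):
        ratio = q1 / q0  # assumes q0 > 0
        if ratio > threshold:
            phi.append(1.0)
        elif ratio == threshold:
            phi.append(gamma)  # randomize on the boundary
        else:
            phi.append(0.0)
    return phi

def size_and_power(p0, p1, phi):
    """False alarm probability E0[phi] and detection probability E1[phi]."""
    size = sum(q0 * f for q0, f in zip(p0, phi))
    power = sum(q1 * f for q1, f in zip(p1, phi))
    return size, power

p0 = [0.5, 0.25, 0.25]
p1 = [0.25, 0.25, 0.5]
phi = likelihood_ratio_test(p0, p1, threshold=1.0, gamma=0.5)
print(phi, size_and_power(p0, p1, phi))
```

Varying `gamma` between 0 and 1 moves the size continuously across the jump at the boundary point, which is how a Neyman-Pearson size constraint is met exactly on a discrete support.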


When $P_0$, $P_1$ are not known but are members of the disjoint uncertainty classes $\mathcal{P}_0$, $\mathcal{P}_1$, respectively,
then Huber and Strassen [3] have shown that if the composite hypotheses can be described in terms of
alternating capacities of order 2, then the minimax problems given by (2.4) and (2.5) are solved by an
ordinary test between a fixed representative pair $\hat{P}_0$, $\hat{P}_1$ of simple hypotheses where $\hat{P}_i \in \mathcal{P}_i$ ($i = 0, 1$).
Note that for their development the condition $P_0 \gg P_1$ was not used, the support space need not have a
finite number of elements, and $\hat{P}_0$, $\hat{P}_1$ are independent of the threshold $T$. In certain cases they also
showed that the alternating capacity condition is necessary.


In  our  development,  we  wish  to  define  a  new  class  of  robust  detectors  which  depend  upon  the 
threshold,  T.  If  certain  conditions  are  satisfied,  we  will  find  that  the  alternating  capacity  condition  is  not 
necessary  in  order  for  robust  solutions  to  exist  and  that  again  the  robust  solutions  are  given  by  a  fixed 
representative  pair  of  simple  hypotheses.  Our  performance  measure  for  optimality  is  Bayes  risk. 


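To this end, we restrict $\phi$ to take the following form for Bayes tests for some $(P_0, P_1) \in \mathcal{P}_0 \times \mathcal{P}_1$:

$$\phi(X) = \begin{cases} 1, & p_1(X)/p_0(X) \ge T \\ 0, & p_1(X)/p_0(X) < T. \end{cases} \tag{2.7}$$

For a given threshold $T$ we define the T-dependent risks:

$$R(P_0, \phi, T) = C_0 \operatorname{Prob}\{\phi \text{ accepts } H_1 \mid H_0 \text{ true}, T\}, \tag{2.8}$$

$$R(P_1, \phi, T) = C_1 \operatorname{Prob}\{\phi \text{ accepts } H_0 \mid H_1 \text{ true}, T\}. \tag{2.9}$$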

Measures that can be directly associated with the risks are the probabilities of detection (the power of the
test) and false alarm (the size of the test), which we denote by $P_D$ and $P_F$, respectively. These are defined
as

$$P_D(P_1, \phi, T) = \operatorname{Prob}\{\phi \text{ accepts } H_1 \mid H_1 \text{ true}, T\}, \tag{2.10}$$

$$P_F(P_0, \phi, T) = \operatorname{Prob}\{\phi \text{ accepts } H_1 \mid H_0 \text{ true}, T\}. \tag{2.11}$$

In addition the probability of a missed detection is defined by

$$P_M(P_1, \phi, T) = 1 - P_D(P_1, \phi, T). \tag{2.12}$$

Let $\hat{\phi}$ be the likelihood ratio test associated with a given pair $(\hat{P}_0, \hat{P}_1) \in \mathcal{P}$. For an arbitrary input pair
$(P_0, P_1) \in \mathcal{P}$, the Bayes risk is defined as

$$R_b(P_0, P_1, \hat{\phi}, T) = \pi_0 C_0 P_F(P_0, \hat{\phi}, T) + \pi_1 C_1 P_M(P_1, \hat{\phi}, T) \tag{2.13}$$

where

$$T = \frac{\pi_0 C_0}{\pi_1 C_1}. \tag{2.14}$$

For a given $T$, we desire to find a $\hat{\phi}$ associated with the pair $(\hat{P}_0, \hat{P}_1) \in \mathcal{P}$ such that the following
bounding condition is satisfied:

$$R_b(P_0, P_1, \hat{\phi}, T) \le R_b(\hat{P}_0, \hat{P}_1, \hat{\phi}, T) \tag{2.15}$$

for all $(P_0, P_1) \in \mathcal{P}$.

Definition: For a given $T$, a pair $(\hat{P}_0, \hat{P}_1)$ is called least favorable in terms of risk and T-dependence with
respect to the hypothesis test (2.1) if (2.15) is satisfied, where $\hat{\phi}$ is associated with $\hat{P}_0$, $\hat{P}_1$ and $T$
according to Eq. (2.7). The pair is also called the least favorable T-dependent pair.

From Bayes risk theory, the least favorable T-dependent pair also satisfies the following inequality:

$$R_b(\hat{P}_0, \hat{P}_1, \phi, T) \ge R_b(\hat{P}_0, \hat{P}_1, \hat{\phi}, T) \tag{2.16}$$

where $\phi$ is arbitrary [12]. The inequalities given by (2.15) and (2.16) indicate that $(\hat{P}_0, \hat{P}_1)$ is a saddle
point solution of (2.5).
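The right-hand half of the saddle point, inequality (2.16), can be illustrated numerically on a finite support. The following is a minimal sketch under stated assumptions: the densities and priors are made-up toy values, the function names are hypothetical, and only deterministic tests are enumerated (the Bayes risk is linear in the test, so its minimum over randomized tests is attained at a deterministic one).

```python
from itertools import product

def matched_test(p0_hat, p1_hat, T):
    """Test of the form (2.7): accept H1 where p1_hat / p0_hat >= T."""
    return [1.0 if q1 >= T * q0 else 0.0 for q0, q1 in zip(p0_hat, p1_hat)]

def bayes_risk(p0, p1, phi, pi0, c0, pi1, c1):
    """Bayes risk (2.13) of test phi when the true pair is (p0, p1)."""
    p_false_alarm = sum(q0 * f for q0, f in zip(p0, phi))
    p_miss = 1.0 - sum(q1 * f for q1, f in zip(p1, phi))
    return pi0 * c0 * p_false_alarm + pi1 * c1 * p_miss

# Toy values (assumptions for illustration only).
p0_hat, p1_hat = [0.5, 0.3, 0.2], [0.1, 0.3, 0.6]
pi0, c0, pi1, c1 = 0.5, 1.0, 0.5, 1.0
T = (pi0 * c0) / (pi1 * c1)  # threshold (2.14)

risk_matched = bayes_risk(p0_hat, p1_hat, matched_test(p0_hat, p1_hat, T), pi0, c0, pi1, c1)
risk_best = min(
    bayes_risk(p0_hat, p1_hat, list(phi), pi0, c0, pi1, c1)
    for phi in product([0.0, 1.0], repeat=len(p0_hat))
)
print(risk_matched, risk_best)
```

The other half, (2.15), depends on the structure of the uncertainty classes and is the subject of the following sections.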

III.  ROBUST  SOLUTIONS  VIA  A  MINIMIZATION  PROBLEM 

A.  Minimization  Problem  Definition 

Huber  and  Strassen  [3]  showed  that  the  least  favorable  pair  associated  with  the  discrimination 
problem  for  composite  hypotheses  that  can  be  described  as  2-alternating  capacities  can  be  characterized 
as  the  solution  of  an  integral  minimization  problem.  This  characterization  (or  a  modification  of  the  form 
given  in  [10])  also  has  been  found  for  pdf  banded  classes  and  for  other  optimization  criteria  where  a 
minimax  solution  is  desired  (cf.  [12]).  In  this  section,  we  show  for  spaces  with  a  finite  number  of 
elements that the least favorable T-dependent pair can also be characterized as the solution of a
minimization  problem. 

For not necessarily finite support spaces, consider the functional defined by

$$J(P_0, P_1) = \int_X F\!\left(\frac{dP_1}{dP_0}\right) dP_0 \tag{3.1}$$

where $P_0$, $P_1$ have been previously defined, $dP_1/dP_0$ is the Radon-Nikodym derivative of $P_1$ with respect
to $P_0$, and $F$ is a convex function such that its domain and range are in $\mathbb{R}^+ \cup \{0\}$. Set $\mathcal{P} = \mathcal{P}_0 \times \mathcal{P}_1$.
Consider the problem of finding

$$\min_{(P_0, P_1) \in \mathcal{P}} J(P_0, P_1). \tag{3.2}$$


Under the assumption that the probability densities of $P_i$ ($i = 0, 1$) exist and are denoted by $p_i$, then
equation (3.1) can be rewritten as

$$J(p_0, p_1) = \int_X F\!\left(\frac{p_1}{p_0}\right) p_0 \, d\mu \tag{3.3}$$

where $J(P_0, P_1) = J(p_0, p_1)$. Let $\mathbf{p}_i$ ($i = 0, 1$) denote the sets of probability densities associated with
$\mathcal{P}_i$. We see that the minimization problem of (3.2) is equivalent to the problem:

$$\min_{(p_0, p_1) \in \mathbf{p}} J(p_0, p_1) \tag{3.4}$$

where $\mathbf{p} = \mathbf{p}_0 \times \mathbf{p}_1$. We note that $\mathbf{p}$ is a bounded subset of the space $L^1[\mu] \times L^1[\mu]$ and, by using Jensen's
inequality, $J$ is bounded from below by $F(1)$. Hence we have a problem of minimizing a convex
functional that is bounded from below over a bounded subset of a Banach space. Results related to the
existence of this minimum can be found in [16-18] and in particular if $\mathbf{p}$ is compact via the Weierstrass
theorem. Using a result in Poor [12], we can prove the following existence result if $\mathbf{p}$ is compact, but $J$
is not necessarily convex.
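On a finite support with counting measure, the functional (3.3) is a plain sum and the Jensen lower bound $F(1)$ is easy to observe. The sketch below is illustrative: the choice $F(z) = z^2$ and the sample densities are assumptions made for the example, not quantities from the text.

```python
def j_functional(p0, p1, F):
    """J(p0, p1) = sum over the support of p0 * F(p1 / p0); assumes p0 > 0."""
    return sum(q0 * F(q1 / q0) for q0, q1 in zip(p0, p1))

def square(z):
    """An example convex F (an assumption for this illustration)."""
    return z * z

p0 = [0.5, 0.5]
p1 = [0.25, 0.75]
# Jensen: sum p0 * F(p1/p0) >= F(sum p0 * p1/p0) = F(1).
print(j_functional(p0, p1, square), square(1.0))
# The bound is attained when the two densities coincide.
print(j_functional(p0, p0, square))
```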

Theorem 1: Suppose the class $\mathcal{P}_0 \cup \mathcal{P}_1$ is dominated by a $\sigma$-finite measure $\mu$ on $(X, \mathcal{F})$ and that

$$\frac{dP_0}{d\mu} > 0 \tag{3.5}$$

almost everywhere (a.e.) $[\mu]$ for all $(P_0, P_1) \in \mathcal{P}$. If $\mathbf{p}$ is a compact subset of $L^p[\mu] \times L^p[\mu]$ for some $p \ge 1$
(note $\|(p_0, p_1)\| = \|p_0\| + \|p_1\|$), the functional $J(p_0, p_1)$ achieves a minimum on $\mathbf{p}$.

Proof: The proof is a slight modification of the proof given by Poor for his Theorem 2 [12]. For his
Theorem 2, $F(p_1/p_0) = (p_1/p_0)^2$. If we substitute $F(p_1/p_0)$ for $(p_1/p_0)^2$ in his proof, all the conclusions
remain the same and Theorem 1 follows. □
We further restrict $F$ to be monotonically increasing, twice differentiable, greater than or equal to
zero, and $F'' \ge 0$ ($F''$ denotes the second derivative of $F$). Further restrictions will be placed on $F$ as
our development proceeds.

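We set

$$p_{0\nu} = (1 - \nu)\hat{p}_0 + \nu p_0, \tag{3.6}$$

$$p_{1\nu} = (1 - \nu)\hat{p}_1 + \nu p_1, \tag{3.7}$$

where $0 \le \nu \le 1$; $p_0, \hat{p}_0 \in \mathbf{p}_0$; $p_1, \hat{p}_1 \in \mathbf{p}_1$. By convexity, $(p_{0\nu}, p_{1\nu}) \in \mathbf{p}$. Because
the support space of $X$ has a finite number of elements, henceforth we represent all integrations over $\Omega_x$
or subsets of $\Omega_x$ as summations. Define the following scalar functions of $\nu$ on [0, 1]: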


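$$H_0(\nu) = \sum_{\Omega_x} F\!\left(\frac{\hat{p}_1}{p_{0\nu}}\right) p_{0\nu}, \tag{3.8}$$

$$H_1(\nu) = \sum_{\Omega_x} F\!\left(\frac{p_{1\nu}}{\hat{p}_0}\right) \hat{p}_0, \tag{3.9}$$

$$H_2(\nu) = \sum_{\Omega_x} F\!\left(\frac{p_{1\nu}}{p_{0\nu}}\right) p_{0\nu}. \tag{3.10}$$

Lemma 1: $H_0$, $H_1$, $H_2$ are convex functions of $\nu$ on [0, 1].
Proof: It is straightforward to show that

$$\frac{dH_0}{d\nu} = \sum_{\Omega_x} (p_0 - \hat{p}_0) \left[ F\!\left(\frac{\hat{p}_1}{p_{0\nu}}\right) - \frac{\hat{p}_1}{p_{0\nu}} F'\!\left(\frac{\hat{p}_1}{p_{0\nu}}\right) \right], \tag{3.11}$$

$$\frac{d^2H_0}{d\nu^2} = \sum_{\Omega_x} \frac{\hat{p}_1^2 (p_0 - \hat{p}_0)^2}{p_{0\nu}^3} F''\!\left(\frac{\hat{p}_1}{p_{0\nu}}\right) \ge 0, \tag{3.12}$$

$$\frac{dH_1}{d\nu} = \sum_{\Omega_x} (p_1 - \hat{p}_1) F'\!\left(\frac{p_{1\nu}}{\hat{p}_0}\right), \tag{3.13}$$

$$\frac{d^2H_1}{d\nu^2} = \sum_{\Omega_x} \frac{(p_1 - \hat{p}_1)^2}{\hat{p}_0} F''\!\left(\frac{p_{1\nu}}{\hat{p}_0}\right) \ge 0, \tag{3.14}$$

$$\frac{dH_2}{d\nu} = \sum_{\Omega_x} (p_0 - \hat{p}_0) \left[ F\!\left(\frac{p_{1\nu}}{p_{0\nu}}\right) - \frac{p_{1\nu}}{p_{0\nu}} F'\!\left(\frac{p_{1\nu}}{p_{0\nu}}\right) \right] + \sum_{\Omega_x} (p_1 - \hat{p}_1) F'\!\left(\frac{p_{1\nu}}{p_{0\nu}}\right), \tag{3.15}$$

$$\frac{d^2H_2}{d\nu^2} = \sum_{\Omega_x} \frac{(\hat{p}_1 p_0 - p_1 \hat{p}_0)^2}{p_{0\nu}^3} F''\!\left(\frac{p_{1\nu}}{p_{0\nu}}\right) \ge 0, \tag{3.16}$$

where $F'$ is the first derivative of $F$. The results of (3.12), (3.14), and (3.16) verify the lemma. □

Because $H_2$ is a convex function of $\nu$ it follows that $J$ is convex on $\mathbf{p}$.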
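The convexity of $H_2$ along a line segment in $\mathbf{p}$ can be spot-checked numerically by comparing a finite-difference second derivative with the closed form (3.16). This is a minimal sketch with assumed toy densities and the assumed choice $F(z) = z^2$ (so $F'' = 2$); the function names are illustrative.

```python
def h2(nu, p0_hat, p1_hat, p0, p1, F):
    """H2(nu) = sum of p0nu * F(p1nu / p0nu), as in (3.10)."""
    total = 0.0
    for a0, a1, b0, b1 in zip(p0_hat, p1_hat, p0, p1):
        q0 = (1.0 - nu) * a0 + nu * b0
        q1 = (1.0 - nu) * a1 + nu * b1
        total += q0 * F(q1 / q0)
    return total

def h2_second_derivative(nu, p0_hat, p1_hat, p0, p1, F2):
    """Closed form (3.16); F2 is the second derivative of F."""
    total = 0.0
    for a0, a1, b0, b1 in zip(p0_hat, p1_hat, p0, p1):
        q0 = (1.0 - nu) * a0 + nu * b0
        q1 = (1.0 - nu) * a1 + nu * b1
        total += (a1 * b0 - b1 * a0) ** 2 / q0 ** 3 * F2(q1 / q0)
    return total

F = lambda z: z * z   # assumed example of F
F2 = lambda z: 2.0    # its second derivative

p0_hat, p1_hat = [0.5, 0.3, 0.2], [0.2, 0.3, 0.5]
p0, p1 = [0.4, 0.4, 0.2], [0.1, 0.4, 0.5]

h, nu = 1e-3, 0.5
finite_diff = (h2(nu + h, p0_hat, p1_hat, p0, p1, F)
               - 2.0 * h2(nu, p0_hat, p1_hat, p0, p1, F)
               + h2(nu - h, p0_hat, p1_hat, p0, p1, F)) / h ** 2
print(finite_diff, h2_second_derivative(nu, p0_hat, p1_hat, p0, p1, F2))
```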

B.  Minimization  Solution  Convergence 

A useful functional form for $F$ is now defined which allows us to obtain our results. Define the
function $G$ on $\mathbb{R}^+ \cup \{0\}$ as

$$G(z) = zF'(z) - F(z). \tag{3.17}$$

It can be shown that

$$F(z) = z \int_0^z \frac{G(t)}{t^2} \, dt, \tag{3.18}$$

$$G'(z) = zF''(z). \tag{3.19}$$


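Set $G = G_\epsilon(z, T)$ and define the function

$$G_\epsilon(z, T) = \begin{cases} 0 & \text{for } z \in \Omega_0 \equiv [0, T) \\ \dfrac{1}{\epsilon}(z - T) & \text{for } z \in \Omega_1 \equiv [T, T + \epsilon) \\ 1 & \text{for } z \in \Omega_2 \equiv [T + \epsilon, \infty) \end{cases} \tag{3.20}$$

where the sets $\Omega_0$, $\Omega_1$, and $\Omega_2$ are defined in the equation. For this characterization, $G_\epsilon(z, T) \to u(z - T)$
as $\epsilon \downarrow 0$ where $u$ is the Heaviside step function. For this $G_\epsilon(z, T)$, we can write $F_\epsilon(z, T)$ using (3.18)
explicitly as

$$F_\epsilon(z, T) = \begin{cases} 0, & z \in \Omega_0 \\ \dfrac{1}{\epsilon}\left[ z \ln \dfrac{z}{T} - (z - T) \right], & z \in \Omega_1 \\ \dfrac{z}{\epsilon} \ln\!\left(1 + \dfrac{\epsilon}{T}\right) - 1, & z \in \Omega_2. \end{cases} \tag{3.21}$$

Using (3.21), it is straightforward to show that $F_\epsilon$ is twice differentiable with respect to $z$, greater than
or equal to zero, and monotonically increasing. However $F_\epsilon''$ is not continuous. Define

$$F_0(z, T) = \begin{cases} 0, & z < T \\ \dfrac{1}{T}(z - T), & z \ge T. \end{cases} \tag{3.22}$$

We will need the following two lemmas for our development: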
We  will  need  the  following  two  lemmas  for  our  development; 

Lemma 2: $F_\epsilon(z, T)$ converges uniformly on $(0, z_{\max}]$ to $F_0(z, T)$ as $\epsilon \downarrow 0$ where $0 < z_{\max} < \infty$.
Proof: See Appendix A.

Lemma 3: Set $H_\epsilon(z, T) = F_\epsilon'(z, T) - \frac{1}{T} G_\epsilon(z, T)$. For the characterization of $G_\epsilon(z, T)$ given by (3.20),
$H_\epsilon(z, T)$ converges uniformly to 0 for $z \ge 0$ as $\epsilon \downarrow 0$.

Proof: See Appendix B.
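The behavior claimed in Lemmas 2 and 3 can be observed numerically. The sketch below is an illustration, not a proof: the piecewise expression used for $F_\epsilon$ on $[T + \epsilon, \infty)$ is the one obtained by integrating (3.18) with $G_\epsilon$ from (3.20), which is an assumption of this example, and the grid, $T = 1$ and $\epsilon = 10^{-3}$ are arbitrary choices.

```python
import math

def g_eps(z, T, eps):
    """Ramp approximation (3.20) of the unit step at z = T."""
    if z < T:
        return 0.0
    if z < T + eps:
        return (z - T) / eps
    return 1.0

def f_eps(z, T, eps):
    """F_eps obtained from F(z) = z * integral of G(t) / t^2 (assumed form)."""
    if z < T:
        return 0.0
    if z < T + eps:
        return (z * math.log(z / T) - (z - T)) / eps
    return (z / eps) * math.log1p(eps / T) - 1.0

def f_eps_prime(z, T, eps):
    """Derivative of f_eps with respect to z."""
    if z < T:
        return 0.0
    if z < T + eps:
        return math.log(z / T) / eps
    return math.log1p(eps / T) / eps

def f_0(z, T):
    """Limit function (3.22)."""
    return 0.0 if z < T else (z - T) / T

T, eps = 1.0, 1e-3
grid = [0.01 * i for i in range(1, 501)]  # z in (0, 5]
sup_f = max(abs(f_eps(z, T, eps) - f_0(z, T)) for z in grid)
sup_h = max(abs(f_eps_prime(z, T, eps) - g_eps(z, T, eps) / T) for z in grid)
print(sup_f, sup_h)  # both shrink as eps is decreased
```

On a bounded range both suprema scale roughly linearly with `eps`, which is consistent with the $O(\epsilon)$ remainders used later in the development.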

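Define the functional $J_\epsilon(p_0, p_1)$ as

$$J_\epsilon(p_0, p_1) = \sum_{\Omega_x} p_0 F_\epsilon(p_1/p_0, T) \tag{3.23}$$

and set (if they exist)

$$J_\epsilon(p_0^\epsilon, p_1^\epsilon) = \min_{(p_0, p_1) \in \mathbf{p}} J_\epsilon(p_0, p_1), \tag{3.24}$$

$$(p_0^\epsilon, p_1^\epsilon) = \arg \min_{(p_0, p_1) \in \mathbf{p}} J_\epsilon(p_0, p_1). \tag{3.25}$$

We point out that $p_i^\epsilon$ ($i = 0, 1$) can also depend on $T$. The following theorem establishes conditions under
which $J_\epsilon(p_0^\epsilon, p_1^\epsilon)$ converges as $\epsilon \downarrow 0$.

Theorem 2: If

C1. $p_i^\epsilon \to \hat{p}_i$ uniformly on $\Omega_x$ as $\epsilon \downarrow 0$, $i = 0, 1$,
C2. $\min_X \hat{p}_0 > 0$ exists,

then

$$p_1^\epsilon / p_0^\epsilon \to \hat{p}_1 / \hat{p}_0 \quad \text{uniformly on } X \text{ as } \epsilon \downarrow 0, \tag{3.26}$$

$$J_\epsilon(p_0^\epsilon, p_1^\epsilon) \to J_0(\hat{p}_0, \hat{p}_1) = T^{-1} - \beta R_b(\hat{P}_0, \hat{P}_1, \hat{\phi}, T), \tag{3.27}$$

where $\beta = (\pi_0 C_0)^{-1}$.

Proof: It is elementary to show that C1 and C2 imply (3.26). Define $e(\epsilon) = \sup_X \left| p_1^\epsilon / p_0^\epsilon - \hat{p}_1 / \hat{p}_0 \right|$.
(3.26) implies $e(\epsilon) \downarrow 0$ as $\epsilon \downarrow 0$ and

$$\hat{p}_1 / \hat{p}_0 - e(\epsilon) \le p_1^\epsilon / p_0^\epsilon \le \hat{p}_1 / \hat{p}_0 + e(\epsilon).$$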

Define  the  following  sets 

*5  =  {x  1  P/Po  >  2’} 

5,  =  {X  \p\ip^o  >  n 

=  (x  |p,/Po  +  e{€)  >  T} 

=  {x  Ip/Po  -  e{e)  >  T} 

^  =  {x  |p,/Po  =  2’} 

=  S:  -  A. 


(3.26) 

(3.27) 
Because of Lemma 2 and < oo , we can write


Pi)  - 


.  1 


Pi  _  y, 
Po 


0(f) 


=  E 


1  A*  f5‘ 

y  Pi  ■  Po 


0(e). 


Define  the  following  set  functions 


i-io  =  E 


-j  Pi  Po 


’/(O  =  E 


1  - 

jPx  -  Po 


where  C  e  Because  5/  ^  Q  S^  ,  it  follows  that 


Using  Cl -2  and  fi  (flj  <  oo,  it  is  straightforward  to  show  that 


=  viSJ  *  0(e) 


V\S)  =  v(S,)  *  0(e) 


=  7,(5/)  ^  0(e). 


Also  with  little  difficulty,  we  can  show 


lim  ri(S;)  =  r}{S) 
<»o 


and 


lim  r}(A)  = 
<10 


Now  because  A  =  4>  (the  empty  set),  then 


(3.29) 


(3.30a) 

(3.30b) 

(3.31) 

(3.32) 

(3.33) 

(3.34) 

(3.35a) 

(3.35b) 
vis:) = n(A) + V(A).


(3.36) 


However, since $\hat{p}_1 = T \hat{p}_0$ on $A$,

$$\eta(A) = \sum_{A} \left[ \frac{1}{T} \hat{p}_1 - \hat{p}_0 \right] = 0. \tag{3.37}$$


Thus  ri{S*)  =  7}(A  )  and  lim  r}(S*)  =  i?(5)..  Using  this  result  and  (3.31)-(3.35),  we  see  that 

^  £i0 


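$$\lim_{\epsilon \downarrow 0} \eta^\epsilon(S_\epsilon) = \eta(S) = \frac{1}{T} P_D(\hat{P}_1, \hat{\phi}, T) - P_F(\hat{P}_0, \hat{\phi}, T) = \frac{1}{T} - \frac{1}{\pi_0 C_0} R_b(\hat{P}_0, \hat{P}_1, \hat{\phi}, T). \tag{3.38}$$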
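The identity behind the limit in Theorem 2, $J_0(\hat{p}_0, \hat{p}_1) = T^{-1} - \beta R_b(\hat{P}_0, \hat{P}_1, \hat{\phi}, T)$ with $\beta = (\pi_0 C_0)^{-1}$, can be confirmed on a toy example. The densities, priors and costs below are assumptions chosen for illustration, and the function names are hypothetical.

```python
def j0(p0_hat, p1_hat, T):
    """J_0 = sum of p0 * F_0(p1 / p0, T), with F_0(z) = (z - T) / T for z >= T."""
    total = 0.0
    for q0, q1 in zip(p0_hat, p1_hat):
        if q1 >= T * q0:
            total += q1 / T - q0
    return total

def matched_bayes_risk(p0_hat, p1_hat, pi0, c0, pi1, c1):
    """Threshold (2.14) and Bayes risk (2.13) of the test matched to the pair."""
    T = (pi0 * c0) / (pi1 * c1)
    p_false_alarm = sum(q0 for q0, q1 in zip(p0_hat, p1_hat) if q1 >= T * q0)
    p_detect = sum(q1 for q0, q1 in zip(p0_hat, p1_hat) if q1 >= T * q0)
    return T, pi0 * c0 * p_false_alarm + pi1 * c1 * (1.0 - p_detect)

p0_hat, p1_hat = [0.5, 0.3, 0.2], [0.1, 0.3, 0.6]
pi0, c0, pi1, c1 = 0.6, 1.0, 0.4, 1.0
T, risk = matched_bayes_risk(p0_hat, p1_hat, pi0, c0, pi1, c1)
beta = 1.0 / (pi0 * c0)
print(j0(p0_hat, p1_hat, T), 1.0 / T - beta * risk)  # the two values agree
```

Because $J_0$ is an affine, decreasing function of the Bayes risk, minimizing $J_0$ over the class corresponds to maximizing the risk of the matched test.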


C.  Minimization  Solution  Properties 

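The densities $p_0^\epsilon$, $p_1^\epsilon$ are defined by (3.25). Using Lemma 1, for $\hat{p}_0 = p_0^\epsilon$ and $\hat{p}_1 = p_1^\epsilon$,

$$\left. \frac{dH_0}{d\nu} \right|_{\nu = 0} \ge 0. \tag{3.39}$$

Using (3.11), (3.17), and (3.39), it follows that (with $G = G_\epsilon(z, T)$)

$$\sum_{\Omega_x} p_0^\epsilon \, G_\epsilon\!\left(\frac{p_1^\epsilon}{p_0^\epsilon}, T\right) \ge \sum_{\Omega_x} p_0 \, G_\epsilon\!\left(\frac{p_1^\epsilon}{p_0^\epsilon}, T\right) \quad \text{for all } P_0 \in \mathcal{P}_0. \tag{3.40}$$

In similar fashion, since Lemma 1 implies

$$\left. \frac{dH_1}{d\nu} \right|_{\nu = 0} \ge 0, \tag{3.41}$$

it follows via Lemma 3 that

$$\sum_{\Omega_x} p_1^\epsilon \, G_\epsilon\!\left(\frac{p_1^\epsilon}{p_0^\epsilon}, T\right) \le \sum_{\Omega_x} p_1 \, G_\epsilon\!\left(\frac{p_1^\epsilon}{p_0^\epsilon}, T\right) + O(\epsilon). \tag{3.42}$$

We can combine (3.40) and (3.42) to obtain

$$\sum_{\Omega_x} \left[ \frac{1}{T} p_1^\epsilon - p_0^\epsilon \right] G_\epsilon\!\left(\frac{p_1^\epsilon}{p_0^\epsilon}, T\right) \le \sum_{\Omega_x} \left[ \frac{1}{T} p_1 - p_0 \right] G_\epsilon\!\left(\frac{p_1^\epsilon}{p_0^\epsilon}, T\right) + O(\epsilon). \tag{3.43}$$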


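Using Lemma 3, we can write

$$\sum_{\Omega_x} \left[ \frac{1}{T} p_1^\epsilon - p_0^\epsilon \right] G_\epsilon\!\left(\frac{p_1^\epsilon}{p_0^\epsilon}, T\right) = \sum_{S_\epsilon} \left[ \frac{1}{T} p_1^\epsilon - p_0^\epsilon \right] + O(\epsilon) \tag{3.44}$$

where $S_\epsilon$ was previously defined. Thus under the conditions of Theorem 2,

$$\lim_{\epsilon \downarrow 0} \sum_{\Omega_x} \left[ \frac{1}{T} p_1^\epsilon - p_0^\epsilon \right] G_\epsilon\!\left(\frac{p_1^\epsilon}{p_0^\epsilon}, T\right) = T^{-1} - \beta R_b(\hat{P}_0, \hat{P}_1, \hat{\phi}, T). \tag{3.45}$$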

We  now  upper  bound  the  right  side  of  (3.43).  Define 

A'""  =  E  Po  O,(pl/Po,  T). 

Q, 

We  know 

ir  Po-  Oie) 


where 


fl,.  =  |x  \pl/Po  >  T"  " 


and  n  is  a  positive  integer.  Assuming  condition  C2  of  Theorem  2,  then  pUpo  =  pJpQ  +  0(e). 
the  set 


Ke  =  \  X  \PjPo  ^  ^ 


and  the  set  functions 


V%D)  =  E  Po 

D 

y\\D)  =  E  Pi 


(3.44) 


(3.45) 


(3.46) 


Define 
(3.47)


where  D  G  Since 


iP  >  v%Bi)  +  0(e) 


In  similar  fashion,  if  we  define 


4*^  =  E  Pi(^  -  GSPliPl  D), 


(3.48) 


we  can  show 

>  i?'(C„;)  +  <7(6)  (3.49) 

where 


C* 


|x  Ip/Po  ^ 


Furthermore,  using  (3.47)  and  (3.49),  it  follows  that 


E 


GSP'M  .7)  =  -^  (1 


C)  - 


1 

T 


i,'(o  -  f(B:,)  +  0(6). 


(3.50) 


It  is  straightforward  to  show  that 


lim  lim  tj'(0  =  P^iPi,  0,  T) 

n-*oo  fiO 

(3.51) 

lim  lim  V(0  -P^Po,^,T)- 

n-^oa  €40 

(3.52) 

Thus combining (3.51), (3.52) with (3.43) and (3.45), it follows that as $\epsilon \downarrow 0$,

$$R_b(\hat{P}_0, \hat{P}_1, \hat{\phi}, T) \ge R_b(P_0, P_1, \hat{\phi}, T). \tag{3.53}$$

We summarize the preceding results in the following theorem:

Theorem 3: Assume the pair $(p_0^\epsilon, p_1^\epsilon)$ exists such that

$$(p_0^\epsilon, p_1^\epsilon) = \arg \min_{(p_0, p_1) \in \mathbf{p}} \sum_{\Omega_x} p_0 F_\epsilon\!\left(\frac{p_1}{p_0}, T\right) \tag{3.54}$$

for all $0 < \epsilon < \epsilon_{\max}$ and some $\epsilon_{\max} > 0$. Under the conditions of Theorem 2, $(\hat{P}_0, \hat{P}_1)$ is a least
favorable T-dependent pair.

We see from this result and Theorem 1 that under fairly general conditions, least favorable T-dependent
solutions exist (note the conditions given for the classes in Theorem 1 are not necessary). In
the next sections, we will find conditions and formulations for solutions to the minimization problem
posed by (3.4) when $\epsilon > 0$. The least favorable T-dependent solution will result in the limit as $\epsilon \downarrow 0$.

IV.  EXAMPLE:  DIVERGENCE  CLASS 

A.  Preliminaries 

In this section, by way of example, we present a methodology for finding the least favorable
T-dependent pair associated with $\mathcal{P}_i$ ($i = 0, 1$) defined as divergence classes. The divergence [19] of two
densities $p$, $q$ where $Q \gg P$ is defined as

$$D(p, q) = \sum_{\Omega_x} p \ln \frac{p}{q}. \tag{4.1a}$$

We define the two hypotheses classes as

$$\mathcal{P}_i = \{ P_i \mid D(p_i, p_i^*) \le \Delta_i \}, \quad i = 0, 1 \tag{4.1b}$$

where $p_i^*$ are the known nominal densities of $H_i$ ($i = 0, 1$), $P_i^* \gg P_i$, and $\Delta_i$ are positive real numbers
chosen such that $\mathcal{P}_0 \cap \mathcal{P}_1 = \emptyset$. Note $\mathcal{P}_i$ ($i = 0, 1$) are convex classes. Conditions for $\mathcal{P}_0 \cap \mathcal{P}_1 = \emptyset$ are
given in Appendix C for $\Delta_0 = \Delta_1$. The divergence class (or simply div-class) is shown not to necessarily
be 2-alternating capacitable in Appendix D.
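Membership in a divergence class (4.1b) is a one-line computation on a finite support. The following minimal sketch uses assumed toy densities and an assumed radius; the function names are illustrative. It also spot-checks the convexity of the class on one mixture.

```python
import math

def divergence(p, q):
    """D(p, q) = sum of p * ln(p / q), with the convention 0 * ln 0 = 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)

def in_div_class(p, nominal, radius):
    """True if p lies in {p : D(p, nominal) <= radius}."""
    return divergence(p, nominal) <= radius

nominal = [0.25, 0.25, 0.5]   # assumed nominal density
radius = 0.02                 # assumed class radius
p = [0.3, 0.2, 0.5]
q = [0.2, 0.3, 0.5]
mixture = [0.5 * a + 0.5 * b for a, b in zip(p, q)]

print(divergence(p, nominal), divergence(q, nominal))
print(in_div_class(p, nominal, radius), in_div_class(q, nominal, radius),
      in_div_class(mixture, nominal, radius))
```

Since $D(\cdot, q)$ is convex in its first argument, any mixture of two members stays in the class, which is what the last line illustrates for a single case.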

The  restriction  of  the  support  to  have  a  finite  number  of  elements  has  two  significant  benefits. 
The  first  is  related  to  the  fact  that  in  practice  all  solutions  that  are  formulated  have  unknown  parameters 
that  are  solved  for  via  constraint  equations  and  the  digital  computer.  Hence  the  support  is  almost  always 
modelled as a finite set in order to solve for the unknown parameters of the T-dependent least favorable
pair solution. The second benefit of assuming a finite support set is that one may use the powerful
Kuhn-Tucker convexity theorem for finite dimensions [17] that guarantees that if the Lagrange multiplier
equation is solvable then the solution is a global minimum. In addition, if the minimum exists on an open
subset of the convex probability uncertainty classes, $\mathcal{P}$, then the Lagrange multiplier equation is
necessarily solvable.

For this problem, a support, $\Omega_x$, need not be explicitly defined since none of the constraint equations
depend on $\Omega_x$. For notational purposes we define an index set, $I = \{1, 2, \ldots, K\}$ where $K$ is the number
of elements of $\Omega_x$ and $K \ge 3$. Also define

$$\mathbf{p}_i = \{ p_i \mid D(p_i, p_i^*) \le \Delta_i,\ p_i \in \mathbf{p}^K \}, \quad i = 0, 1, \tag{4.2a}$$

$$\mathbf{p}_i^{*} = \{ p_i \mid D(p_i, p_i^*) = \Delta_i,\ p_i \in \mathbf{p}^K \}, \quad i = 0, 1, \tag{4.2b}$$

$\mathbf{p} = \mathbf{p}_0 \times \mathbf{p}_1$, and $\mathbf{p}^{*} = \mathbf{p}_0^{*} \times \mathbf{p}_1^{*}$, where $\mathbf{p}^K$ is the space of probability measures on a set of $K$
elements. We assume that $\mathbf{p}_i^{*}$ ($i = 0, 1$) has an infinite number of elements, which will be true if $K \ge 3$.

We observe that $\mathbf{p}_i$ ($i = 0, 1$) is a closed subset of the hypercube $\times_{k=1}^{K} [0, 1]$ defined on $\mathbb{R}^K$. Hence,
because $\mathbf{p}_i$ ($i = 0, 1$) is closed and bounded, $\mathbf{p}_i$ and $\mathbf{p}$ are compact. Thus Theorem 1 guarantees that
a minimum to (3.4) exists. Define $\mathbf{p}^{I} = \operatorname{int}(\mathbf{p}^{*})$ (int = interior of). In the following, we assume for the
least favorable T-dependent solution that $(\hat{p}_0, \hat{p}_1) \in \mathbf{p}^{I}$. Thus since $\mathbf{p}^{I}$ is an open subset of
$\mathbf{p}^{*}$, the Lagrange multiplier equations are necessarily solvable. In addition, under the assumption of
$(\hat{p}_0, \hat{p}_1)$ existing on $\mathbf{p}^{I}$, then $(p_0^\epsilon, p_1^\epsilon) \in \mathbf{p}^{I}$ for all $0 < \epsilon < \epsilon_{\max}$ and some $\epsilon_{\max} > 0$. Thus the
Lagrange multiplier equations are necessarily solvable for $(p_0^\epsilon, p_1^\epsilon)$, $0 < \epsilon < \epsilon_{\max}$, for some $\epsilon_{\max} > 0$.

In order to show that $(\hat{p}_0, \hat{p}_1)$ is a least favorable T-dependent pair, we must show that the conditions
C1 and C2 of Theorem 2 are met. After $(\hat{p}_0, \hat{p}_1)$ is found via the methodology to be presented, one can
check and see if $(\hat{p}_0, \hat{p}_1) \in \mathbf{p}^{I}$ and verify condition C2. Finally, we note with respect to condition C1
that for a finite discrete support space of $x$, pointwise convergence on $\Omega_x$ implies uniform convergence
on $\Omega_x$.

B.  Derivation 

For notational purposes, we write $F_\epsilon(z, T) = F_\epsilon(z)$ and $G_\epsilon(z, T) = G_\epsilon(z)$. Consider the minimization
of $J_\epsilon(p_0, p_1)$ defined by (3.23) with $p_i \in \mathbf{p}_i$ ($i = 0, 1$) defined by (4.2a). Besides the divergence inequality
constraints  expressed  by  (4.2a),  the  densities  must  also  satisfy  the  total  mass  and  semi-positivity 
constraints.  Note,  all  the  constraint  functions  are  convex.  In  constructing  the  Lagrangian  for  this 
minimization  problem,  we  will  at  first  ignore  these  last  constraints  and  show  that  the  solution  obtained 
without  these  constraints  satisfies  the  semi-positivity  constraints  and  by  proper  normalization  can  be  made 
to  satisfy  the  total  mass  constraints. 

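The Lagrangian for this minimization problem on a finite support is given by

$$L = \sum_{I} p_0^\epsilon F_\epsilon\!\left(\frac{p_1^\epsilon}{p_0^\epsilon}\right) + s_0^\epsilon \left[ \sum_{I} p_0^\epsilon \ln \frac{p_0^\epsilon}{p_0^*} - \Delta_0 \right] + s_1^\epsilon \left[ \sum_{I} p_1^\epsilon \ln \frac{p_1^\epsilon}{p_1^*} - \Delta_1 \right] \tag{4.3}$$

where each $p_0^\epsilon$, $p_1^\epsilon$, $p_0^*$, $p_1^*$ is indexed with respect to the elements of $I$ and $s_i^\epsilon$ ($i = 0, 1$) are Lagrange
multipliers. We have superscripted these unknowns to indicate that they are functions of $\epsilon$. We will do
this with other unknowns as well. We sum over these indexed elements and denote this by $\sum_I$. The
Kuhn-Tucker convexity theorem [17] guarantees a global minimum on convex $\mathbf{p}$ if the following equations
are solvable on $\mathbf{p}$:

$$\frac{\partial L}{\partial p_0^\epsilon} = -G_\epsilon\!\left(\frac{p_1^\epsilon}{p_0^\epsilon}\right) + s_0^\epsilon + s_0^\epsilon \ln \frac{p_0^\epsilon}{p_0^*} = 0, \tag{4.4a}$$

$$\frac{\partial L}{\partial p_1^\epsilon} = F_\epsilon'\!\left(\frac{p_1^\epsilon}{p_0^\epsilon}\right) + s_1^\epsilon + s_1^\epsilon \ln \frac{p_1^\epsilon}{p_1^*} = 0, \tag{4.4b}$$

$$D(p_i^\epsilon, p_i^*) = \Delta_i, \quad (i = 0, 1) \tag{4.5}$$

where $\infty > s_i^\epsilon \ge 0$. If the minimum of $J_\epsilon$ is an interior point of $\mathbf{p}$ then (4.4)-(4.5) are necessary and
sufficient conditions. In order to obtain a solution we assume $\lim_{\epsilon \downarrow 0} s_i^\epsilon = s_i \ne 0$ ($i = 0, 1$) and check
this condition afterward. We also assume

C3. All of the parameters of the constraint equations which are functions of $\epsilon$ have limits as $\epsilon \downarrow 0$.

Later in our development, we will show conditions under which C3 holds for this particular problem.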

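Set $\Lambda^\epsilon = p_1^\epsilon / p_0^\epsilon$ and $\Lambda^* = p_1^* / p_0^*$, where $\Lambda^\epsilon$ and $\Lambda^*$ represent the solution pair's and nominals' likelihood
ratios, respectively. Using (4.4a) and (4.4b), we can show

$$\frac{1}{s_0^\epsilon} G_\epsilon(\Lambda^\epsilon) + \frac{1}{s_1^\epsilon} F_\epsilon'(\Lambda^\epsilon) + \ln \Lambda^\epsilon = \ln \Lambda^*. \tag{4.6}$$

If $\Lambda^\epsilon$ is known then

$$p_1^\epsilon = p_1^* \exp\left\{ -\left[ 1 + \frac{1}{s_1^\epsilon} F_\epsilon'(\Lambda^\epsilon) \right] \right\}, \tag{4.7}$$

$$p_0^\epsilon = p_0^* \exp\left\{ -\left[ 1 - \frac{1}{s_0^\epsilon} G_\epsilon(\Lambda^\epsilon) \right] \right\}. \tag{4.8}$$

We observe that $p_i^\epsilon \ge 0$ ($i = 0, 1$), so that the semi-positivity constraint is met. In order to satisfy the
total mass constraints, set

$$p_1^\epsilon = c_1^\epsilon \, p_1^* \exp\left[ -\frac{1}{s_1^\epsilon} F_\epsilon'(\Lambda^\epsilon) \right], \tag{4.9}$$

$$p_0^\epsilon = c_0^\epsilon \, p_0^* \exp\left[ \frac{1}{s_0^\epsilon} G_\epsilon(\Lambda^\epsilon) \right], \tag{4.10}$$

where $c_i^\epsilon$ ($i = 0, 1$) are positive numbers chosen to satisfy the total mass constraints. Under C3, let
$c_i^\epsilon \to c_i$ ($i = 0, 1$) as $\epsilon \downarrow 0$.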

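Incorporating these constants, (4.6) becomes

$$\frac{1}{s_0^\epsilon} G_\epsilon(\Lambda^\epsilon) + \frac{1}{s_1^\epsilon} F_\epsilon'(\Lambda^\epsilon) + \ln \Lambda^\epsilon = \ln \Lambda^* + \ln \frac{c_1^\epsilon}{c_0^\epsilon}. \tag{4.11}$$

Via Lemma 3, we write

$$F_\epsilon'(\Lambda^\epsilon) = \frac{1}{T} G_\epsilon(\Lambda^\epsilon) + O(\epsilon) \tag{4.12}$$

and under condition C3 rewrite (4.11) as

$$\alpha^\epsilon G_\epsilon(\Lambda^\epsilon) + \ln \Lambda^\epsilon + O(\epsilon) = \ln \Lambda^* + \ln \frac{c_1^\epsilon}{c_0^\epsilon} \tag{4.13}$$

where $\alpha^\epsilon = (s_1^\epsilon T)^{-1} + (s_0^\epsilon)^{-1}$. Let $\alpha^\epsilon \to \alpha$ as $\epsilon \downarrow 0$.

We now solve for $(\hat{p}_0, \hat{p}_1)$ for the three distinct cases $\Lambda^\epsilon \in \Omega_i$, $i = 0, 1, 2$, where the $\Omega_i$ are
defined by (3.20).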

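Case 1: $\Lambda^\epsilon \in \Omega_0$

For this case $G_\epsilon(\Lambda^\epsilon) = 0$ and (4.13) becomes

$$\ln \Lambda^\epsilon + O(\epsilon) = \ln \Lambda^* + \ln \frac{c_1^\epsilon}{c_0^\epsilon}. \tag{4.14}$$

As $\epsilon \downarrow 0$, then $\Lambda^\epsilon \to (c_1/c_0) \Lambda^*$ and (4.9)-(4.10) become

$$\hat{p}_1 = c_1 p_1^*, \tag{4.15}$$

$$\hat{p}_0 = c_0 p_0^* \tag{4.16}$$

for $\hat{p}_1/\hat{p}_0 < T$ or equivalently $c_1 p_1^* / (c_0 p_0^*) < T$.

Case 2: $\Lambda^\epsilon \in \Omega_2$

For $\Lambda^\epsilon \in \Omega_2$, $G_\epsilon(\Lambda^\epsilon) = 1$ and (4.13) becomes

$$\alpha^\epsilon + \ln \Lambda^\epsilon + O(\epsilon) = \ln \Lambda^* + \ln \frac{c_1^\epsilon}{c_0^\epsilon}. \tag{4.17}$$

As $\epsilon \downarrow 0$, $\Lambda^\epsilon \to (c_1/c_0) \Lambda^* e^{-\alpha}$ and (4.9)-(4.10) become

$$\hat{p}_1 = c_1 e^{-1/(s_1 T)} p_1^*, \tag{4.18}$$

$$\hat{p}_0 = c_0 e^{1/s_0} p_0^* \tag{4.19}$$

for $\hat{p}_1/\hat{p}_0 \ge T$ or equivalently $c_1 p_1^* / (c_0 p_0^*) \ge T e^{\alpha}$.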

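Case 3: $\Lambda^\epsilon \in \Omega_1$

For this case (4.13) becomes

$$\frac{\alpha^\epsilon}{\epsilon} (\Lambda^\epsilon - T) + \ln \Lambda^\epsilon + O(\epsilon) = \ln \Lambda^* + \ln \frac{c_1^\epsilon}{c_0^\epsilon}. \tag{4.20}$$

Set $\Lambda^\epsilon = \Delta\Lambda + T$ where $\Delta\Lambda \ge 0$. Rewrite (4.20) as

$$\ln\left(1 + \frac{\Delta\Lambda}{T}\right) + \frac{\alpha^\epsilon}{\epsilon} \Delta\Lambda = \ln \frac{\Lambda^*}{T} + \ln \frac{c_1^\epsilon}{c_0^\epsilon} + O(\epsilon). \tag{4.21}$$

Now

$$\ln\left(1 + \frac{\Delta\Lambda}{T}\right) = \frac{\Delta\Lambda}{T} + O(\Delta\Lambda^2). \tag{4.22}$$

Using (4.21) and (4.22), it can be shown that

$$\Delta\Lambda = \frac{1}{\dfrac{1}{T} + \dfrac{\alpha^\epsilon}{\epsilon}} \left[ \ln \frac{\Lambda^*}{T} + \ln \frac{c_1^\epsilon}{c_0^\epsilon} + O(\epsilon) + O(\Delta\Lambda^2) \right]. \tag{4.23}$$

Because $G_\epsilon(\Lambda^\epsilon) = \Delta\Lambda / \epsilon$, it follows from (4.23) that

$$G_\epsilon(\Lambda^\epsilon) = \frac{1}{\alpha^\epsilon} \ln \frac{c_1^\epsilon p_1^*}{c_0^\epsilon T p_0^*} + O(\epsilon), \tag{4.24}$$

so that, as $\epsilon \downarrow 0$, (4.9)-(4.10) become

$$\hat{p}_1 = c_1^{\lambda} c_0^{1-\lambda} T^{1-\lambda} (p_1^*)^{\lambda} (p_0^*)^{1-\lambda}, \tag{4.25}$$

$$\hat{p}_0 = c_1^{\lambda} c_0^{1-\lambda} T^{-\lambda} (p_1^*)^{\lambda} (p_0^*)^{1-\lambda}, \tag{4.26}$$

where $\lambda = (s_0 \alpha)^{-1}$ and $T \le c_1 p_1^* / (c_0 p_0^*) < T e^{\alpha}$. Let $\lambda^\epsilon = (s_0^\epsilon \alpha^\epsilon)^{-1}$ and $\lambda^\epsilon \to \lambda$ as $\epsilon \downarrow 0$ under
condition C3.

Hence, we see that $(p_0^\epsilon, p_1^\epsilon) \to (\hat{p}_0, \hat{p}_1)$ pointwise (or in this case, uniformly on the finite space, $\Omega_x$)
as $\epsilon \downarrow 0$ under the conditions that 1) a solution exists for the unknown parameters of the densities,
$(\hat{p}_0, \hat{p}_1)$, which are found via the constraint equations, 2) $(\hat{p}_0, \hat{p}_1) \in \operatorname{int}(\mathbf{p}^{*})$ where $\mathbf{p}^{*}$ is defined by
(4.2b), and 3) condition C3 holds.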
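The three cases can be collected into a single routine that evaluates the limiting form of the pair for given parameters. This is a sketch under stated assumptions: the parameters $c_0$, $c_1$, $\alpha$ and $\lambda$ are taken as inputs rather than solved from the total mass and divergence constraints, so the outputs are in general not normalized; the relations $1/s_0 = \alpha\lambda$ and $1/(s_1 T) = \alpha(1 - \lambda)$ follow from the definitions of $\alpha$ and $\lambda$ above; the function name and numerical values are illustrative.

```python
import math

def limiting_pair(p0_star, p1_star, c0, c1, alpha, lam, T):
    """Limiting (eps -> 0) form of (4.9)-(4.10) over the three regions."""
    p0_hat, p1_hat = [], []
    for q0, q1 in zip(p0_star, p1_star):
        r = c1 * q1 / (c0 * q0)          # scaled nominal likelihood ratio
        if r < T:
            g = 0.0                       # Case 1: G = 0
        elif r < T * math.exp(alpha):
            g = math.log(r / T) / alpha   # Case 3: limiting value of G
        else:
            g = 1.0                       # Case 2: G = 1
        p0_hat.append(c0 * q0 * math.exp(alpha * lam * g))
        p1_hat.append(c1 * q1 * math.exp(-alpha * (1.0 - lam) * g))
    return p0_hat, p1_hat

# Toy nominals and parameters (assumptions for illustration only).
p0_star, p1_star = [0.5, 0.3, 0.2], [0.2, 0.36, 0.44]
p0_hat, p1_hat = limiting_pair(p0_star, p1_star, c0=1.0, c1=1.0,
                               alpha=math.log(2.0), lam=0.5, T=1.0)
print([b / a for a, b in zip(p0_hat, p1_hat)])  # ratios: below T, equal to T, above T
```

In the middle region the likelihood ratio of the limiting pair is clipped to exactly $T$, and above it the nominal ratio is scaled by $e^{-\alpha}$; a full solution would adjust the four parameters until both densities sum to one and both divergence constraints (4.5) hold.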

C.  Conditions  on  the  Solution 

Now that we have the solution form for $p_0$ and $p_1$, we can give readily checked conditions under which condition C3 can be verified.

Lemma 4: C3 is true (i.e. all the parameters of the constraint equations have limits as $\epsilon \downarrow 0$) if

C4. a) $c_1 p_1^{*}/(c_0 p_0^{*}) \ne T$ or $T e^{a}$ for all $x \in \Omega$ and b) the Jacobian of the constraint equations is non-zero.

Proof: Condition C4 exemplifies the conditions associated with the Inverse Function Theorem [20]; the constraint equations must be continuously differentiable and the Jacobian of the constraint equations must be non-zero in order that the constraint equations are invertible in the neighborhood of $c_0$, $c_1$, $a$, and $\lambda$. If C4a is true, it is straightforward to show that the constraint equations are continuously differentiable.

Set $y = (c_0^{\epsilon}, c_1^{\epsilon}, a^{\epsilon}, \lambda^{\epsilon})$ and $y_0 = (c_0, c_1, a, \lambda)$ and let $z_i = f_i(y)$ $(i = 1, 2, 3, 4)$ denote the four constraint equations with $f_i(y_0) = 0$. Using the previous development of the derivation of $p_0$ and $p_1$, it can be shown that
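\[
p_1^{\epsilon} =
\begin{cases}
c_1^{\epsilon}\, p_1^{*} + O(\epsilon) ; & c_1^{\epsilon} p_1^{*}/(c_0^{\epsilon} p_0^{*}) < T + O(\epsilon) \\[4pt]
(c_1^{\epsilon})^{\lambda^{\epsilon}} (c_0^{\epsilon})^{1-\lambda^{\epsilon}}\, T^{1-\lambda^{\epsilon}}\, (p_1^{*})^{\lambda^{\epsilon}} (p_0^{*})^{1-\lambda^{\epsilon}} + O(\epsilon) ; & T + O(\epsilon) \le c_1^{\epsilon} p_1^{*}/(c_0^{\epsilon} p_0^{*}) < T e^{a^{\epsilon}} + O(\epsilon) \\[4pt]
c_1^{\epsilon}\, e^{-(1-\lambda^{\epsilon}) a^{\epsilon}}\, p_1^{*} + O(\epsilon) ; & c_1^{\epsilon} p_1^{*}/(c_0^{\epsilon} p_0^{*}) \ge T e^{a^{\epsilon}} + O(\epsilon)
\end{cases}
\tag{4.28}
\]

\[
p_0^{\epsilon} =
\begin{cases}
c_0^{\epsilon}\, p_0^{*} + O(\epsilon) ; & c_1^{\epsilon} p_1^{*}/(c_0^{\epsilon} p_0^{*}) < T + O(\epsilon) \\[4pt]
(c_1^{\epsilon})^{\lambda^{\epsilon}} (c_0^{\epsilon})^{1-\lambda^{\epsilon}}\, T^{-\lambda^{\epsilon}}\, (p_1^{*})^{\lambda^{\epsilon}} (p_0^{*})^{1-\lambda^{\epsilon}} + O(\epsilon) ; & T + O(\epsilon) \le c_1^{\epsilon} p_1^{*}/(c_0^{\epsilon} p_0^{*}) < T e^{a^{\epsilon}} + O(\epsilon) \\[4pt]
c_0^{\epsilon}\, e^{\lambda^{\epsilon} a^{\epsilon}}\, p_0^{*} + O(\epsilon) ; & c_1^{\epsilon} p_1^{*}/(c_0^{\epsilon} p_0^{*}) \ge T e^{a^{\epsilon}} + O(\epsilon)
\end{cases}
\tag{4.29}
\]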


where the order terms added to each $T$ and $T e^{a^{\epsilon}}$, respectively, are identical. Examining (4.28) and (4.29), it is found that the expressions for $p_i^{\epsilon}$ $(i = 0, 1)$ at the boundaries of the regions of applicability are within $O(\epsilon)$. Thus (4.28) and (4.29) can be rewritten such that the order terms do not appear in the regions of applicability but are incorporated as order terms in the expressions for $p_i^{\epsilon}$ $(i = 0, 1)$.

If these solutions are substituted into the constraint equations, it is found that $f_i(y) = O_i(\epsilon)$ $(i = 1, 2, 3, 4)$ where we have subscripted each order term to indicate its distinctness. We point out that each constraint equation has three summations, each taken over one of the three regions defined by (4.28) or (4.29). Condition C4a guarantees that a term appearing in one of the summations will not jump to another summation for arbitrarily small perturbations about $y_0$. This will also be true for the first and second derivatives of $f_i$ $(i = 1, 2, 3, 4)$. Under C4a, it can be shown that each term of each summation of $f_i$ $(i = 1, 2, 3, 4)$ is continuously differentiable and hence $f_i$ is continuously differentiable. Hence under C4 and the Inverse Function Theorem, the solution for $(c_0^{\epsilon}, c_1^{\epsilon}, a^{\epsilon}, \lambda^{\epsilon})$ exists in the neighborhood of $(c_0, c_1, a, \lambda)$ for arbitrarily small $\epsilon$ and $\lim_{\epsilon \downarrow 0} (c_0^{\epsilon}, c_1^{\epsilon}, a^{\epsilon}, \lambda^{\epsilon}) = (c_0, c_1, a, \lambda)$. $\blacksquare$

We  point  out  that  it  is  highly  likely  that  C4  is  true. 

The condition that $\infty > s_i > 0$ $(i = 0, 1)$ is equivalent to the condition $a > 0$ and $0 < \lambda < 1$. This can be shown via the equations $a = s_1^{-1} + s_0^{-1}$ and $\lambda = (s_0 a)^{-1}$. Because $p_0^{*} > 0$ and $\Omega$ is compact, it follows that condition C2 of Theorem 2 holds. Under condition C4 and the preceding development we see that condition C1 of Theorem 2 holds. Thus we can state:

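Theorem 4: The least favorable T-dependent pair for the divergence class discrimination hypothesis testing problem is given by the following densities under the conditions that 1) a solution exists for the unknown parameters of the densities, which are found via the constraint equations (and the total mass constraints), 2) $(p_0, p_1) \in \operatorname{int}(\mathfrak{p}^{=})$ where $\mathfrak{p}^{=}$ is defined by (4.2b), 3) C4 holds, and 4) $a > 0$, $0 < \lambda < 1$:

\[
p_1 =
\begin{cases}
c_1\, p_1^{*} ; & c_1 p_1^{*}/(c_0 p_0^{*}) < T \\[4pt]
c_1^{\lambda} c_0^{1-\lambda}\, T^{1-\lambda}\, (p_1^{*})^{\lambda} (p_0^{*})^{(1-\lambda)} ; & T \le c_1 p_1^{*}/(c_0 p_0^{*}) < T e^{a} \\[4pt]
c_1\, e^{-(1-\lambda) a}\, p_1^{*} ; & c_1 p_1^{*}/(c_0 p_0^{*}) \ge T e^{a}
\end{cases}
\]

\[
p_0 =
\begin{cases}
c_0\, p_0^{*} ; & c_1 p_1^{*}/(c_0 p_0^{*}) < T \\[4pt]
c_1^{\lambda} c_0^{1-\lambda}\, T^{-\lambda}\, (p_1^{*})^{\lambda} (p_0^{*})^{(1-\lambda)} ; & T \le c_1 p_1^{*}/(c_0 p_0^{*}) < T e^{a} \\[4pt]
c_0\, e^{\lambda a}\, p_0^{*} ; & c_1 p_1^{*}/(c_0 p_0^{*}) \ge T e^{a}
\end{cases}
\]

where $c_0$, $c_1$, $\lambda$, $a$ are the unknown parameters to be determined from the four constraint equations.

The likelihood ratio $p_1/p_0$ is given by

\[
\frac{p_1}{p_0} =
\begin{cases}
c_1 p_1^{*}/(c_0 p_0^{*}) ; & c_1 p_1^{*}/(c_0 p_0^{*}) < T \\[4pt]
T ; & T \le c_1 p_1^{*}/(c_0 p_0^{*}) < T e^{a} \\[4pt]
e^{-a}\, c_1 p_1^{*}/(c_0 p_0^{*}) ; & c_1 p_1^{*}/(c_0 p_0^{*}) \ge T e^{a} .
\end{cases}
\tag{4.30}
\]

The decision rule is given by

\[
\left\{
\begin{array}{ll}
0 ; & p_1^{*}/p_0^{*} < c_0 T e^{a}/c_1 \\[4pt]
1 ; & p_1^{*}/p_0^{*} > c_0 T e^{a}/c_1 .
\end{array}
\right.
\tag{4.31}
\]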
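The piecewise form of Theorem 4 can be sketched in a few lines of code. The function below is a minimal illustration, not the report's procedure: it takes the parameters `c0`, `c1`, `a`, `lam` as given (in the text they are solved from the four constraint equations, which is not attempted here), so the returned vectors are not normalized densities unless those parameters actually satisfy the constraints. The function name and the sample numbers are hypothetical.

```python
import math

def least_favorable_pair(p0_nom, p1_nom, c0, c1, a, lam, T):
    """Evaluate the three-branch densities of Theorem 4 pointwise.

    p0_nom, p1_nom: nominal densities on a finite support (all entries > 0).
    c0, c1, a, lam: parameters assumed known here (hypothetical inputs).
    Returns (p0, p1) as lists; the branch is chosen by c1*p1*/(c0*p0*).
    """
    p0, p1 = [], []
    for q0, q1 in zip(p0_nom, p1_nom):
        r = c1 * q1 / (c0 * q0)          # scaled nominal likelihood ratio
        if r < T:                        # lower region: scaled nominals
            p1.append(c1 * q1)
            p0.append(c0 * q0)
        elif r < T * math.exp(a):        # middle region: geometric mixture
            mix = (c1 * q1) ** lam * (c0 * q0) ** (1.0 - lam)
            p1.append(mix * T ** (1.0 - lam))
            p0.append(mix * T ** (-lam))
        else:                            # upper region: tilted nominals
            p1.append(c1 * math.exp(-(1.0 - lam) * a) * q1)
            p0.append(c0 * math.exp(lam * a) * q0)
    return p0, p1

if __name__ == "__main__":
    p0, p1 = least_favorable_pair([0.4, 0.3, 0.2, 0.1], [0.1, 0.2, 0.3, 0.4],
                                  c0=1.0, c1=1.0, a=0.5, lam=0.4, T=1.0)
    print([y / x for x, y in zip(p0, p1)])  # likelihood ratio per support point
```

In this sketch the ratio `p1/p0` equals `T` on the middle branch and `exp(-a)` times the scaled nominal ratio on the upper branch, which is the structure stated in (4.30).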


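D. Calculation of $P_D$, $P_F$

Let $P_D^{*}(T)$ and $P_F^{*}(T)$ be the probabilities of detection and false alarm for $P_1^{*} \sim H_1$ and $P_0^{*} \sim H_0$, respectively, of the nominal decision rule with threshold $T$. Using (4.31),

\[
P_D(P_1, T) = \mathrm{Prob}\!\left[ \frac{c_1 p_1^{*}}{c_0 p_0^{*}}\, e^{-a} > T \right]
= c_1\, e^{-(1-\lambda) a}\, P_D^{*}\!\left( \frac{c_0 T e^{a}}{c_1} \right) .
\tag{4.32}
\]

Similarly, it can be shown

\[
P_F(P_0, T) = c_0\, e^{\lambda a}\, P_F^{*}\!\left( \frac{c_0 T e^{a}}{c_1} \right) .
\tag{4.33}
\]

Thus for any $(P_0, P_1) \in \mathcal{P}$, using (3.53), (4.30), and (4.31),

\[
R(P_0, P_1, T) \le \pi_0\, c_0\, e^{\lambda a}\, P_F^{*}\!\left( \frac{c_0 T e^{a}}{c_1} \right)
+ \pi_1 \left[ 1 - c_1\, e^{-(1-\lambda) a}\, P_D^{*}\!\left( \frac{c_0 T e^{a}}{c_1} \right) \right] .
\tag{4.34}
\]

Hence knowing $T$, $c_0$, $c_1$, $\lambda$, $a$, $P_D^{*}(\cdot)$, and $P_F^{*}(\cdot)$ allows us to find the upper bound on Bayes risk over the uncertainty classes of the two hypotheses.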
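As an illustration of how the nominal operating characteristics feed the bound, the sketch below evaluates the nominal $P_D^{*}$ and $P_F^{*}$ at the shifted threshold $c_0 T e^{a}/c_1$ from (4.31) and rescales them by the upper-branch factors of Theorem 4. It assumes a finite support, the usual 0-1 cost Bayes risk $\pi_0 P_F + \pi_1 (1 - P_D)$, and parameter values that are simply supplied by the caller; all function names are hypothetical.

```python
import math

def nominal_pd_pf(p0_nom, p1_nom, tau):
    """Nominal detection / false-alarm probabilities of the rule
    'decide H1 when p1*/p0* > tau' on a finite support."""
    pd = sum(q1 for q0, q1 in zip(p0_nom, p1_nom) if q1 / q0 > tau)
    pf = sum(q0 for q0, q1 in zip(p0_nom, p1_nom) if q1 / q0 > tau)
    return pd, pf

def least_favorable_pd_pf(p0_nom, p1_nom, c0, c1, a, lam, T):
    """Rescale the nominal probabilities at the shifted threshold.
    On the decision region the densities are c1*exp(-(1-lam)*a)*p1*
    and c0*exp(lam*a)*p0* (upper branch of Theorem 4)."""
    tau = c0 * T * math.exp(a) / c1
    pd_nom, pf_nom = nominal_pd_pf(p0_nom, p1_nom, tau)
    return (c1 * math.exp(-(1.0 - lam) * a) * pd_nom,
            c0 * math.exp(lam * a) * pf_nom)

def bayes_risk(pi0, pi1, pd, pf):
    """0-1 cost Bayes risk (an assumption of this sketch)."""
    return pi0 * pf + pi1 * (1.0 - pd)

if __name__ == "__main__":
    pd, pf = least_favorable_pd_pf([0.4, 0.3, 0.2, 0.1], [0.1, 0.2, 0.3, 0.4],
                                   c0=1.0, c1=1.0, a=0.5, lam=0.4, T=1.0)
    print(pd, pf, bayes_risk(0.5, 0.5, pd, pf))
```

The design choice here is to reuse the nominal curves rather than re-sum the least favorable densities, which mirrors the remark that only $P_D^{*}(\cdot)$ and $P_F^{*}(\cdot)$ plus the solved parameters are needed.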

V.  EXAMPLE;  DIVERGENCE/LINEAR  CLASS 

The results of the previous section can be readily extended to a class we call the Divergence/Linear class (or simply D/L class). In general, the two hypothesis classes are defined as

\[
\mathcal{P}_i = \left\{ P_i \;\middle|\; D(p_i, p_i^{*}) \le \Delta_i ;\ \int h_{im}(x)\, p_i(x)\, d\mu = c_{im},\ m = 1, 2, \ldots, M \right\} ; \quad i = 0, 1 ,
\tag{5.1}
\]

where $h_{im}$ are known functions of the elements of $x$ and the $c_{im}$ are specified. For example, if $h_{im}(x)$ is a multidimensional monomial of the elements of $x$ then $c_{im}$ corresponds to a moment of these elements. We again assume $\mathcal{P}_0 \cap \mathcal{P}_1 = \emptyset$. In addition, we assume that the constraints indicated by (5.1) are regular [17] on $\Omega$. Because the constraints are convex functions of $p_i$ $(i = 0, 1)$ it is straightforward to show that $\mathcal{P}_i$ $(i = 0, 1)$ are convex sets. It is shown in Appendix D that the D/L class is not necessarily 2-alternating capacitable.
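The convexity claim can be spelled out in one line. The sketch below assumes that $D(\cdot,\cdot)$ is the Kullback-Leibler divergence used in the Lagrangian and in (C1), that the moment values are written $c_{im}$, and that $p, q$ are any two members of the same class with $0 \le \theta \le 1$.

```latex
\begin{align*}
D\big(\theta p + (1-\theta) q,\; p_i^{*}\big)
  &\le \theta\, D(p, p_i^{*}) + (1-\theta)\, D(q, p_i^{*})
   \le \theta \Delta_i + (1-\theta) \Delta_i = \Delta_i , \\
\int h_{im}\,\big(\theta p + (1-\theta) q\big)\, d\mu
  &= \theta\, c_{im} + (1-\theta)\, c_{im} = c_{im} ,
  \qquad m = 1, 2, \ldots, M .
\end{align*}
```

The first line is the convexity of the divergence in its first argument; the second is linearity of the moment constraints. Together they show that the mixture stays in the class.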




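As in the preceding sections, we restrict $\Omega$ to have a finite number of elements. However in this case, due to the moment constraints, a support must be specified. Set $\Omega = \{x_1, x_2, \ldots, x_K\}$ where there are $K$ elements in the support and $x_k$ $(k = 1, 2, \ldots, K)$ is an $N$-length vector and $K > M + 3$. Define

\[
\mathfrak{p}_i = \left\{ p_i \;\middle|\; D(p_i, p_i^{*}) \le \Delta_i ;\ \sum_{\Omega} h_{im}\, p_i = c_{im} ;\ m = 1, 2, \ldots, M ;\ p_i \in \mathbb{R}^{K} \right\}, \quad i = 0, 1
\tag{5.2a}
\]

\[
\mathfrak{p}_i^{=} = \left\{ p_i \;\middle|\; D(p_i, p_i^{*}) = \Delta_i ;\ \sum_{\Omega} h_{im}\, p_i = c_{im} ;\ m = 1, 2, \ldots, M ;\ p_i \in \mathbb{R}^{K} \right\}, \quad i = 0, 1 .
\tag{5.2b}
\]

We assume $\mathfrak{p}_i^{=}$ contains an infinite number of elements, which will be true if $K > M + 3$. Again (as in the previous section), $\mathfrak{p}$ and $\mathfrak{p}^{=}$ are compact and Theorem 1 guarantees that the minimum to (3.4) on $\mathfrak{p}$ or $\mathfrak{p}^{=}$ exists. As before we assume the least favorable T-dependent densities are on the interior of $\mathfrak{p}^{=}$, or $(p_0, p_1) \in \operatorname{int}(\mathfrak{p}^{=})$.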

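The Lagrangian for the minimization problem posed by (3.3) and the class given by (5.2a) (or (5.2b)) is

\[
\sum_{\Omega} p_0\, F\!\left( \frac{p_1}{p_0} \right)
+ s_0 \sum_{\Omega} p_0 \ln \frac{p_0}{p_0^{*}}
+ s_1 \sum_{\Omega} p_1 \ln \frac{p_1}{p_1^{*}}
+ \sum_{\Omega} \sum_{m=1}^{M} \left( \lambda_{0m} h_{0m}\, p_0 + \lambda_{1m} h_{1m}\, p_1 \right)
\tag{5.3}
\]

where $s_i$, $\lambda_{im}$ are the Lagrange multipliers $(i = 0, 1;\ m = 1, 2, \ldots, M)$ and each $p_0$, $p_1$, $p_0^{*}$, $p_1^{*}$, $h_{im}$ are indexed with respect to the elements of $\Omega$.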

Using  the  methodology  of  the  previous  section  it  is  straightforward  to  show: 

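Theorem 5: The least favorable T-dependent pair for the divergence/linear class discrimination problem is given by the following densities under the conditions that 1) a solution exists for the unknown parameters of the densities, which are found via the constraint equations ((4.5), total mass constraints, and moment constraints), 2) $(p_0, p_1) \in \operatorname{int}(\mathfrak{p}^{=})$ where $\mathfrak{p}^{=}$ is defined by (5.2b), 3) C5 holds (given below), and 4) $a > 0$, $0 < \lambda < 1$:

Define $V$ (which is indexed by $k$) as

\[
V = \frac{c_1 p_1^{*}}{c_0 p_0^{*}} \exp\!\left[ -\sum_{m=1}^{M} \left( \alpha_{1m} h_{1m} - \alpha_{0m} h_{0m} \right) \right]
\]

and the sets

\[
\begin{aligned}
\Omega_{V_0} &= \{ k \mid V < T \} \\
\Omega_{V_1} &= \{ k \mid T \le V < T e^{a} \} \\
\Omega_{V_2} &= \{ k \mid V \ge T e^{a} \} \\
\Omega_{V_3} &= \{ k \mid V \le T e^{a} \} .
\end{aligned}
\]

Condition C5 (which is an extension of Lemma 4) is

C5. a) $V \ne T$ or $T e^{a}$ for all $x \in \Omega$ and

b) the Jacobian of the constraint equations is non-zero.

The densities are given by

\[
p_1 =
\begin{cases}
c_1\, p_1^{*} \exp\!\left[ -\sum_{m=1}^{M} \alpha_{1m} h_{1m} \right] ; & k \in \Omega_{V_0} \\[6pt]
c_1^{\lambda} c_0^{1-\lambda}\, T^{1-\lambda}\, (p_1^{*})^{\lambda} (p_0^{*})^{(1-\lambda)} \exp\!\left[ -\sum_{m=1}^{M} \left( \lambda \alpha_{1m} h_{1m} + (1-\lambda) \alpha_{0m} h_{0m} \right) \right] ; & k \in \Omega_{V_1} \\[6pt]
c_1\, e^{-(1-\lambda) a}\, p_1^{*} \exp\!\left[ -\sum_{m=1}^{M} \alpha_{1m} h_{1m} \right] ; & k \in \Omega_{V_2}
\end{cases}
\]

\[
p_0 =
\begin{cases}
c_0\, p_0^{*} \exp\!\left[ -\sum_{m=1}^{M} \alpha_{0m} h_{0m} \right] ; & k \in \Omega_{V_0} \\[6pt]
c_1^{\lambda} c_0^{1-\lambda}\, T^{-\lambda}\, (p_1^{*})^{\lambda} (p_0^{*})^{(1-\lambda)} \exp\!\left[ -\sum_{m=1}^{M} \left( (1-\lambda) \alpha_{0m} h_{0m} + \lambda \alpha_{1m} h_{1m} \right) \right] ; & k \in \Omega_{V_1} \\[6pt]
c_0\, e^{\lambda a}\, p_0^{*} \exp\!\left[ -\sum_{m=1}^{M} \alpha_{0m} h_{0m} \right] ; & k \in \Omega_{V_2}
\end{cases}
\]

where $c_i$, $\lambda$, $a$, $\alpha_{im}$ $(i = 0, 1;\ m = 1, 2, \ldots, M)$ are unknown parameters to be determined.

The likelihood ratio $p_1/p_0$ is given by

\[
\frac{p_1}{p_0} =
\begin{cases}
V ; & k \in \Omega_{V_0} \\
T ; & k \in \Omega_{V_1} \\
e^{-a}\, V ; & k \in \Omega_{V_2} .
\end{cases}
\tag{5.4}
\]

The decision rule is given by

\[
\left\{
\begin{array}{ll}
0 ; & k \in \Omega_{V_3} \\
1 ; & k \in \Omega_{V_3}^{c}
\end{array}
\right.
\tag{5.5}
\]

where $c$ denotes the complement set. We note that Theorem 4 is actually a special case of Theorem 5 because the total mass constraints can be written as linear constraints with $h_{im} = 1$ $(i = 0, 1)$.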

VI.  SUMMARY 

We  have  presented  a  methodology  for  finding  robust  detectors  for  composite  binary  hypotheses 
defined  for  uncertainty  classes  which  are  not  necessarily  2-alternating  capacitable.  A  robust  detector  is 
defined as a detection structure whose performance measures are sharply lower and/or upper bounded for
given  input  uncertainty  classes.  Past  robust  detection  schemes  have  been  threshold  independent.  The 
robust  test  reduced  to  a  test  between  two  simple  hypotheses  whereby  the  underlying  probability  measures 
were  fixed  representatives  of  the  specified  uncertainty  classes  and  were  independent  of  the  test’s 
threshold.  In  this  paper,  we  presented  conditions  and  formulations  for  detection  structures  which  can  be 
threshold  dependent,  and  which  sharply  upper-bound  the  Bayes  risk  for  the  chosen  detector  threshold. 
The  support  set  was  assumed  to  have  a  finite  number  of  elements.  The  robust  detector  structure  resulted 
from  solving  an  associated  limiting  minimization  problem.  It  was  shown  that  the  robust  test  again  reduces 
to  a  test  between  two  simple  hypotheses  whereby  the  underlying  probability  measures  were  fixed 
representatives  of  the  specified  uncertainty  classes.  However,  these  probability  measures  can  be  a  function 
of  the  detector’s  threshold.  Results  on  the  existence  of  these  solutions  were  presented  and  solutions  for 
the  divergence  and  divergence/linear  uncertainty  classes  were  formulated. 
Appendix A
PROOF  OF  LEMMA  2 


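The regions $\Omega_0$, $\Omega_1$, and $\Omega_2$ are defined by (3.20). If $z_{\max} < T + \epsilon$, the following arguments are easily modified to obtain the proof. Thus we assume $z_{\max} > T + \epsilon$. The functions $F_{\epsilon}(z, T)$ and $F_0(z, T)$ are defined by (3.21) and (3.22), respectively. Define

\[ D_{\epsilon}(z, T) = F_{\epsilon}(z, T) - F_0(z, T) . \tag{A1} \]

To demonstrate that $F_{\epsilon}(z, T) \to F_0(z, T)$ uniformly on $[0, z_{\max}]$ as $\epsilon \downarrow 0$, we show that $\sup_z | D_{\epsilon}(z, T) | \to 0$ as $\epsilon \downarrow 0$ in each of the regions $0 \le z \le T$ and $T < z \le z_{\max}$ ($|\cdot|$ denotes absolute value).

A. For $z \le T$, $D_{\epsilon}(z, T) = 0$ and it trivially follows that $\sup_z | D_{\epsilon}(z, T) | = 0$.

B. For $z > T$, set $z = T + \Delta$ and choose $\epsilon < \Delta$. For $z \in [T + \epsilon, z_{\max}]$

\[ D_{\epsilon}(z, T) = z \left[ \frac{1}{\epsilon} \ln\!\left( 1 + \frac{\epsilon}{T} \right) - \frac{1}{T} \right] . \tag{A2} \]

Since $D_{\epsilon}(z, T)$ is linear with a negative slope and zero intercept, the maximum of $| D_{\epsilon}(z, T) |$ occurs at $z_{\max}$. It is straightforward to show $| D_{\epsilon}(z_{\max}, T) | = O(\epsilon)$. Hence $\sup_z | D_{\epsilon}(z, T) | \to 0$ as $\epsilon \downarrow 0$ for $z > T$.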
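The $O(\epsilon)$ rate in part B is easy to check numerically. The sketch below assumes the linear form $D_{\epsilon}(z, T) = z\,[\epsilon^{-1}\ln(1 + \epsilon/T) - 1/T]$ on $[T + \epsilon, z_{\max}]$, i.e. a zero-intercept line with negative slope as described above; the function names are hypothetical. Because $\ln(1 + x) \ge x - x^2/2$ for $x \ge 0$, the supremum is bounded by $z_{\max}\,\epsilon/(2T^2)$.

```python
import math

def d_eps(z, T, eps):
    """Assumed form of D_eps(z, T) for z >= T + eps (linear in z)."""
    return z * (math.log1p(eps / T) / eps - 1.0 / T)

def sup_abs_d(z_max, T, eps):
    """Supremum of |D_eps| over (T, z_max]: attained at z_max because
    the slope is negative and the intercept is zero."""
    return abs(d_eps(z_max, T, eps))

if __name__ == "__main__":
    T, z_max = 1.0, 5.0
    for eps in (1e-1, 1e-2, 1e-3):
        bound = z_max * eps / (2.0 * T * T)   # first-order bound
        print(eps, sup_abs_d(z_max, T, eps), bound)
```

Running the loop shows the supremum shrinking roughly in proportion to $\epsilon$ and staying under the first-order bound.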




Appendix  B 
PROOF  OF  LEMMA  3 


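The regions $\Omega_0$, $\Omega_1$, $\Omega_2$ and the function $G_{\epsilon}(z, T)$ are defined in Eq. (3.20). Now

\[ F_{\epsilon}'(z, T) = \int_0^{z} \frac{G_{\epsilon}'(\beta, T)}{\beta}\, d\beta . \tag{B1} \]

Define

\[ H_{\epsilon}(z, T) = F_{\epsilon}'(z, T) - \frac{1}{T}\, G_{\epsilon}(z, T) . \tag{B2} \]

To demonstrate that $H_{\epsilon}(z, T)$ converges uniformly to 0 for $z \ge 0$ we show that $\sup_z | H_{\epsilon}(z, T) | \to 0$ as $\epsilon \downarrow 0$ in each region $z < T$, $z = T$, $z > T$.

A. For $z < T$, $H_{\epsilon}(z, T) = 0$ and it trivially follows that $\sup_z | H_{\epsilon}(z, T) | = 0$ in this region.

B. For $z \in \Omega_1$ we can show

\[ H_{\epsilon}(z, T) = \frac{1}{\epsilon} \ln \frac{z}{T} - \frac{1}{\epsilon T}\,(z - T) . \tag{B3} \]

Thus $H_{\epsilon}(T, T) = 0$.

C. For $z > T$, we set $z = T + \Delta$ and choose $\epsilon < \Delta$. Thus $z \in \Omega_2$ and it can be shown

\[ F_{\epsilon}'(z, T) = \frac{1}{\epsilon} \ln\!\left( 1 + \frac{\epsilon}{T} \right) . \tag{B4} \]

Now $H_{\epsilon}(z, T)$ is independent of $z$ and

\[ F_{\epsilon}'(z, T) - \frac{1}{T}\, G_{\epsilon}(z, T) = \frac{1}{\epsilon} \ln\!\left( 1 + \frac{\epsilon}{T} \right) - \frac{1}{T} = O(\epsilon) . \tag{B5} \]

Thus $| H_{\epsilon}(z, T) |$ is uniformly convergent to 0 for $z > T$.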
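The order estimate in part C follows from a two-term expansion of the logarithm. The lines below are a short worked version, assuming $G_{\epsilon}(z, T) = 1$ on $\Omega_2$ (as in Case 2 of Section IV) and writing $x = \epsilon/T$.

```latex
\begin{align*}
\frac{1}{\epsilon}\ln\!\left(1 + \frac{\epsilon}{T}\right) - \frac{1}{T}
  &= \frac{1}{T}\left[\frac{\ln(1+x)}{x} - 1\right]
   = \frac{1}{T}\left[-\frac{x}{2} + \frac{x^{2}}{3} - \cdots\right]
   = -\frac{\epsilon}{2T^{2}} + O(\epsilon^{2}), \\[4pt]
\left|\frac{1}{\epsilon}\ln\!\left(1 + \frac{\epsilon}{T}\right) - \frac{1}{T}\right|
  &\le \frac{\epsilon}{2T^{2}}
  \qquad \text{since } x - \tfrac{x^{2}}{2} \le \ln(1+x) \le x \text{ for } x \ge 0 .
\end{align*}
```

The bound does not depend on $z$, which is why the convergence on $z > T$ is uniform.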
Appendix C

CONDITION  FOR  NON-OVERLAPPING  DIVERGENCE  CLASSES 


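Let $\Delta_1 = \Delta_0 = \Delta$. We wish to find the minimum $\Delta$ such that there exists a pdf $p$ satisfying

\[ \sum_{\Omega} p \ln \frac{p}{p_i^{*}} \le \Delta , \quad i = 0, 1 . \tag{C1} \]

It is straightforward to show that this is equivalent to finding

\[ \min_{p} \sum_{\Omega} p \ln \frac{p}{p_i^{*}} , \quad i = 0, 1 \tag{C2} \]

subject to the constraint

\[ \sum_{\Omega} p \ln \frac{p}{p_0^{*}} = \sum_{\Omega} p \ln \frac{p}{p_1^{*}} . \tag{C3} \]

It can be shown that

\[ p = \frac{(p_1^{*})^{s}\, (p_0^{*})^{1-s}}{\sum_{\Omega} (p_1^{*})^{s}\, (p_0^{*})^{1-s}} \tag{C4} \]

where $s$ is determined from the constraint equation, Eq. (C3). Substituting and simplifying implies $s$ is the solution of

\[ \sum_{\Omega} (p_1^{*})^{s}\, (p_0^{*})^{1-s} \ln \frac{p_1^{*}}{p_0^{*}} = 0 . \tag{C5} \]

Set

\[ f(s) = -\sum_{\Omega} (p_1^{*})^{s}\, (p_0^{*})^{1-s} \ln \frac{p_1^{*}}{p_0^{*}} . \tag{C6} \]

We note $f(s)$ is continuous and monotonically decreasing on the interval $s \in [0, 1]$. In addition, $f(0) > 0$ and $f(1) < 0$. Thus a solution exists and

\[ \Delta_{\min} = -\ln \sum_{\Omega} (p_1^{*})^{s}\, (p_0^{*})^{1-s} . \tag{C7} \]

Hence for $\Delta < \Delta_{\min}$ the divergence uncertainty classes do not overlap.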
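Because $f(s)$ is continuous, decreasing, and changes sign on $[0, 1]$, the root can be located by bisection. The sketch below is one way to do this, assuming strictly positive nominal densities on a common finite support and the exponent convention $(p_1^{*})^{s} (p_0^{*})^{1-s}$; the function names and the two-point example are hypothetical.

```python
import math

def f(s, p0_nom, p1_nom):
    """f(s) = -sum p1*^s p0*^(1-s) ln(p1*/p0*); decreasing in s."""
    return -sum(q1 ** s * q0 ** (1.0 - s) * math.log(q1 / q0)
                for q0, q1 in zip(p0_nom, p1_nom))

def min_overlap_radius(p0_nom, p1_nom, tol=1e-12):
    """Bisection for the root s of f on [0, 1], then
    Delta_min = -ln sum p1*^s p0*^(1-s)."""
    lo, hi = 0.0, 1.0                      # f(lo) > 0 > f(hi)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid, p0_nom, p1_nom) > 0.0:
            lo = mid
        else:
            hi = mid
    s = 0.5 * (lo + hi)
    z = sum(q1 ** s * q0 ** (1.0 - s) for q0, q1 in zip(p0_nom, p1_nom))
    return s, -math.log(z)

if __name__ == "__main__":
    # Symmetric two-point example: the root is s = 1/2 by symmetry.
    s, delta_min = min_overlap_radius([0.8, 0.2], [0.2, 0.8])
    print(s, delta_min)
```

For the symmetric example the normalizing sum at $s = 1/2$ is $2\sqrt{0.16} = 0.8$, so the radius returned should be close to $-\ln 0.8$. Any common radius below that value leaves the two divergence classes disjoint.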
Appendix D

THE  DIVERGENCE  OR  DIVERGENCE/LINEAR  CLASSES 
NEED  NOT  BE  2-ALTERNATING  CAPACITABLE 


We  need  only  consider  the  D/L  classes  since  the  div-class  is  embedded  in  the  D/L  class.  We  can 
construct  a  D/L  class  that  has  only  one  member  in  it.  Because  this  class  is  trivially  2-alternating 
capacitable,  we  cannot  be  definitive  and  say  no  D/L  class  is  2-alternating  capacitable. 

Assume that $\mathcal{P}_1$ is defined by Eq. (5.1) and that this class is 2-alternating capacitable. Specifically, let $\mathcal{P}_1$ be a div-class with finite support such that $K \ge 2$. Let $\mathcal{P}_0$ have only one member, $P_0$ (and thus be 2-alternating capacitable), such that $P_0 \notin \mathcal{P}_1$. It is straightforward to modify Huber and Strassen's [3] Theorem 6.1 to show that if $F$ is any twice continuously differentiable function on $(0, \infty)$ and $P_0 \gg P_1$, then the least favorable pair (in the sense of Huber and Strassen), each of which is 2-alternating capacitable, minimizes $J(p_0, p_1)$, which is defined by (3.3). The resultant pair is functionally independent of the choice of $F$. Consider the solutions for $p_1$ under $F = F_1 = -\ln z$ and $F = F_2 = z \ln z$. Both $F_1$ and $F_2$ are convex and twice continuously differentiable on $(0, \infty)$. For $F_2$, the solution exists and is given by Blahut [21]. Under $F_1$, it can be shown that a solution exists for some $\Delta_1$. However, the solutions under $F_1$ and $F_2$ are not identical. Hence $\mathcal{P}_1$ is not capacitable.